Accelerating the RTTOV-7 IASI and AMSU-A Radiative Transfer Models on Graphics Processing Units: Evaluating CPU/GPU-Hybrid and Pure-GPU Approaches
Abstract
The Radiative Transfer for TOVS (RTTOV) model is a widely used radiative transfer model (RTM) for calculating radiances for satellite infrared and microwave sensors, including the 8461-channel Infrared Atmospheric Sounding Interferometer (IASI) and the 15-band Advanced Microwave Sounding Unit-A (AMSU-A). In the era of hyperspectral sounders with thousands of spectral channels, RTM computation becomes more time-consuming. RTM performance in operational numerical weather prediction systems still limits the number of channels used from hyperspectral sounders to only a few hundred. To take full advantage of such high-resolution infrared observations, a computationally efficient radiative transfer model is needed to facilitate satellite data assimilation. In this paper, we develop a parallel implementation of the RTTOV-7 IASI and AMSU-A RTMs that runs the predictor module on CPUs in a pipeline with the transmittance and radiance modules on NVIDIA many-core graphics processing units (GPUs). We show that concurrent execution of the RTTOV-7 IASI RTM on CPU and GPU, together with asynchronous data transfer from CPU to GPU, allows the GPU-accelerated code running on the 240-core NVIDIA Tesla C1060 to reach speedups of 461x and 1793x for 1-GPU and 4-GPU configurations, respectively. Computing one day's worth of 1,296,000 IASI spectra takes 2.8 days with the CPU code running on a single 3.0 GHz core of the host AMD Phenom II X4 940 CPU. GPU acceleration reduces this running time to 8.75 and 2.25 minutes on the 1-GPU and 4-GPU configurations, respectively. The speedups for the RTTOV AMSU-A RTM are 29x and 75x for the 1-GPU and 4-GPU configurations, respectively. To further boost the speedup of a multispectral RTM, we developed a novel pure-GPU version of the RTTOV AMSU-A RTM in which the predictor module also runs on the GPU, achieving a 96% reduction in host-to-device data transfer. The speedups for the pure-GPU AMSU-A RTM increase significantly, to 56x and 125x for the 1-GPU and 4-GPU configurations, respectively.
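The timing figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the "scaling efficiency" (the 4-GPU speedup divided by four times the 1-GPU speedup) is a derived quantity assumed here for illustration, not a metric reported in the abstract.

```python
# Figures quoted in the abstract.
CPU_DAYS = 2.8               # single-core CPU time for one day's IASI spectra
SPECTRA_PER_DAY = 1_296_000  # IASI spectra in one day
SPEEDUPS = {                 # variant -> {GPU count: speedup over one CPU core}
    "IASI hybrid": {1: 461, 4: 1793},
    "AMSU-A hybrid": {1: 29, 4: 75},
    "AMSU-A pure-GPU": {1: 56, 4: 125},
}


def gpu_minutes(cpu_days, speedup):
    """Running time in minutes after dividing the CPU time by the speedup."""
    return cpu_days * 24 * 60 / speedup


def scaling_efficiency(speedups, n_gpus=4):
    """Assumed metric: n-GPU speedup relative to n times the 1-GPU speedup."""
    return speedups[n_gpus] / (n_gpus * speedups[1])


def cpu_ms_per_spectrum(cpu_days, n_spectra):
    """Average single-core CPU time per spectrum, in milliseconds."""
    return cpu_days * 86_400 * 1000 / n_spectra


if __name__ == "__main__":
    for n_gpus, speedup in SPEEDUPS["IASI hybrid"].items():
        minutes = gpu_minutes(CPU_DAYS, speedup)
        print(f"IASI, {n_gpus} GPU(s): {minutes:.2f} min")
    for name, speedups in SPEEDUPS.items():
        print(f"{name}: 4-GPU scaling efficiency {scaling_efficiency(speedups):.0%}")
    ms = cpu_ms_per_spectrum(CPU_DAYS, SPECTRA_PER_DAY)
    print(f"CPU time per IASI spectrum: {ms:.1f} ms")
```

Dividing 2.8 days (4032 minutes) by 461 and by 1793 reproduces the quoted 8.75 and 2.25 minutes. Under the assumed efficiency definition, the IASI hybrid code scales almost linearly from one to four GPUs (about 97%), while both AMSU-A variants scale less well.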
Similar resources
Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform
There are different variants of Particle Swarm Optimization (PSO) algorithm such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...
Ultra-Fast Image Reconstruction of Tomosynthesis Mammography Using GPU
Digital Breast Tomosynthesis (DBT) is a technology that creates three dimensional (3D) images of breast tissue. Tomosynthesis mammography detects lesions that are not detectable with other imaging systems. If image reconstruction time is in the order of seconds, we can use Tomosynthesis systems to perform Tomosynthesis-guided Interventional procedures. This research has been designed to study u...
Fast Cellular Automata Implementation on Graphic Processor Unit (GPU) for Salt and Pepper Noise Removal
Noise removal is commonly applied as a pre-processing step before subsequent image processing tasks, because noise arises during acquisition or transmission. A common problem in imaging systems that use CMOS or CCD sensors is the appearance of salt and pepper noise. This paper presents a Cellular Automata (CA) framework for noise removal of distorted image by the salt an...
Implementation of the direction of arrival estimation algorithms by means of GPU-parallel processing in the Kuda environment (Research Article)
Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic warfare, sonar, etc. Beamforming methods such as Minimum Variance Distortionless Response (MVDR) and Delay-and-Sum (DAS), and the subspace-based Multiple Signal Classification (MUSIC), are the best-known DOA estimation techniques. These methods have high computational complexity. Hence using...
Self-Tuning Distribution of DB-Operations on Hybrid CPU/GPU Platforms
A current research trend focuses on accelerating database operations with the help of GPUs (Graphics Processing Units). Since GPU algorithms are not necessarily faster than their CPU counterparts, it is important to use them only if they outperform their CPU counterparts. In this paper, we address this problem by constructing a decision model for a framework that is able to distribute database ...
Publication date: 2013